Overview

Dataset statistics

Number of variables: 29
Number of observations: 273074
Missing cells: 567713
Missing cells (%): 7.2%
Total size in memory: 58.6 MiB
Average record size in memory: 225.0 B
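
The percentage and per-record figures above can be reproduced from the raw counts. The following is a minimal Python sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 2**20 bytes. The variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 29
n_observations = 273_074
missing_cells = 567_713
total_size_mib = 58.6

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report, which suggests the assumptions match how the figures were derived.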

Variable types

Text: 17
Numeric: 11
Boolean: 1

Alerts

CITY has constant value "New York City" [Constant]
STATE has constant value "New York" [Constant]
COUNTRY has constant value "United States" [Constant]
IS_BAD_USER is highly imbalanced (99.9%) [Imbalance]
SUMMARY has 26668 (9.8%) missing values [Missing]
EXPERTISE has 95382 (34.9%) missing values [Missing]
CURRENTINDUSTRY has 3625 (1.3%) missing values [Missing]
PUB_NAME has 8047 (2.9%) missing values [Missing]
PUB_DATE has 57936 (21.2%) missing values [Missing]
PUB_DESCRIPTION has 149852 (54.9%) missing values [Missing]
AWARD_NAME has 6483 (2.4%) missing values [Missing]
AWARD_COMPANY has 31679 (11.6%) missing values [Missing]
AWARD_DESCRIPTION has 141832 (51.9%) missing values [Missing]
AWARD_DATE has 46076 (16.9%) missing values [Missing]
F_PROB has 26125 (9.6%) zeros [Zeros]
M_PROB has 27626 (10.1%) zeros [Zeros]
WHITE_PROB has 6465 (2.4%) zeros [Zeros]
BLACK_PROB has 40611 (14.9%) zeros [Zeros]
API_PROB has 26816 (9.8%) zeros [Zeros]
HISPANIC_PROB has 30085 (11.0%) zeros [Zeros]
NATIVE_PROB has 97216 (35.6%) zeros [Zeros]
MULTIPLE_PROB has 59816 (21.9%) zeros [Zeros]
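
The percentages in the missing-value and zero alerts are consistent with each count divided by the number of observations. A short Python sketch, assuming that denominator (273074 rows, from the dataset statistics) and rounding to one decimal; the helper name `share` is hypothetical:

```python
N_OBSERVATIONS = 273_074  # number of observations, from the dataset statistics


def share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format a count as a percentage of all observations, to one decimal."""
    return f"{100 * count / total:.1f}%"


# Counts copied from a few of the alerts above.
missing = {"SUMMARY": 26_668, "EXPERTISE": 95_382, "PUB_DESCRIPTION": 149_852}
zeros = {"F_PROB": 26_125, "NATIVE_PROB": 97_216}

for name, count in missing.items():
    print(f"{name} has {count} ({share(count)}) missing values")
for name, count in zeros.items():
    print(f"{name} has {count} ({share(count)}) zeros")
```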

Reproduction

Analysis started: 2025-09-30 07:18:24.794190
Analysis finished: 2025-09-30 07:18:32.778887
Duration: 7.98 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json
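
The reported duration is the difference between the two timestamps. A small Python check using only the standard library, with the format string as an assumption about how the timestamps are written:

```python
from datetime import datetime

# Assumed timestamp layout: date, time, microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2025-09-30 07:18:24.794190", FMT)
finished = datetime.strptime("2025-09-30 07:18:32.778887", FMT)

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```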

Variables

TITLE
Text

Distinct: 4596
Distinct (%): 1.7%
Missing: 133
Missing (%): < 0.1%
Memory size: 2.1 MiB

Length

Max length: 235
Median length: 199
Mean length: 82.19057599
Min length: 1

Characters and Unicode

Total characters: 22433178
Distinct characters: 161
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
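
In this report every character is listed under a single "(unknown)" category, script and block, so the property lookup itself is not visible. As a general illustration, and not the report's own code, Python's standard `unicodedata` module exposes the general category of each code point. The sample string is the TITLE value shown under Sample below.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lu', 'Ll', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "Chief of Chime Enterprise | Entrepreneur, Businessman"
counts = category_counts(sample)

# Print categories from most to least frequent.
for category, count in counts.most_common():
    print(category, count)
```

Scripts and blocks are not available from `unicodedata`; they would need an additional data source.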

Unique

Unique: 2116
Unique (%): 0.8%

Sample

1st row: Chief of Chime Enterprise | Entrepreneur, Businessman
2nd row: Chief of Chime Enterprise | Entrepreneur, Businessman
3rd row: Chief of Chime Enterprise | Entrepreneur, Businessman
4th row: Chief of Chime Enterprise | Entrepreneur, Businessman
5th row: Chief of Chime Enterprise | Entrepreneur, Businessman

Value                Count    Frequency (%)
(blank)              309737   9.7%
at                   118357   3.7%
of                   108636   3.4%
and                  73114    2.3%
professor            44426    1.4%
the                  33865    1.1%
director             26167    0.8%
president            25099    0.8%
university           24357    0.8%
global               21255    0.7%
Other values (6980)  2395393  75.3%
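
A word-frequency table of this kind can be approximated by lowercasing each value, splitting on whitespace, and counting the tokens. The sketch below works under that assumption; the report's exact tokenisation rules are not stated here, and the second title is a made-up value for illustration.

```python
from collections import Counter

titles = [
    "Chief of Chime Enterprise | Entrepreneur, Businessman",  # from the sample
    "Professor of Economics at Example University",  # hypothetical value
    None,  # a missing value, skipped below
]

# Assumption: tokens are lowercased and split on whitespace only.
word_counts = Counter(
    word
    for title in titles
    if title is not None
    for word in title.lower().split()
)

total = sum(word_counts.values())
for word, count in word_counts.most_common(3):
    print(f"{word}\t{count}\t{100 * count / total:.1f}%")
```

With whitespace splitting, punctuation stays attached to words and a standalone "|" counts as a token of its own.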

Most occurring characters

Value               Count    Frequency (%)
(space)             2940729  13.1%
e                   1910629  8.5%
i                   1433952  6.4%
r                   1414287  6.3%
a                   1366682  6.1%
t                   1363986  6.1%
n                   1316579  5.9%
o                   1212429  5.4%
s                   923066   4.1%
l                   650294   2.9%
Other values (151)  7900545  35.2%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  22433178  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  22433178  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  22433178  100.0%

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 46
Median length: 33
Mean length: 32.53649194
Min length: 12

Characters and Unicode

Total characters: 8884870
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New York, New York, United States
2nd row: New York, New York, United States
3rd row: New York, New York, United States
4th row: New York, New York, United States
5th row: New York, New York, United States

Value         Count   Frequency (%)
new           448002  29.1%
york          448002  29.1%
united        192085  12.5%
states        192085  12.5%
city          98054   6.4%
metropolitan  97980   6.4%
area          60957   4.0%
ny            92      < 0.1%
greater       70      < 0.1%

Most occurring characters

Value              Count    Frequency (%)
(space)            1264253  14.2%
e                  991249   11.2%
t                  870339   9.8%
o                  643962   7.2%
r                  607079   6.8%
N                  448094   5.0%
Y                  448094   5.0%
k                  448002   5.0%
w                  448002   5.0%
i                  388119   4.4%
Other values (14)  2327677  26.2%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  8884870  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  8884870  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  8884870  100.0%

SUMMARY
Text

Missing 

Distinct: 3019
Distinct (%): 1.2%
Missing: 26668
Missing (%): 9.8%
Memory size: 2.1 MiB

Length

Max length: 3033
Median length: 2510
Mean length: 1336.236285
Min length: 1

Characters and Unicode

Total characters: 329256638
Distinct characters: 183
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1088
Unique (%): 0.4%

Sample

1st rowAs the Chief of Chime Enterprise, I lead a high-growth business unit delivering next-generation financial wellness solutions to many of the nation’s largest employers and their workforces. My work is grounded in a simple but urgent belief: the current financial system is fragmented and inequitable—and it’s not built for the way people earn and live today. At Chime, we’re building solutions that give workers better access to their money and a stronger path to long-term financial well-being. For employers, that means deeper engagement, better retention, and healthier teams overall. Before Chime, I spent nearly two decades advising and building solutions for some of the world’s most powerful financial institutions, then went on to found and scale several fintech companies designed to close structural gaps in the system. I’ve seen how broken the infrastructure is—and I’m committed to reshaping it so it works for everyone. More on my work and what drives it: https://jason-lee.co
2nd rowAs the Chief of Chime Enterprise, I lead a high-growth business unit delivering next-generation financial wellness solutions to many of the nation’s largest employers and their workforces. My work is grounded in a simple but urgent belief: the current financial system is fragmented and inequitable—and it’s not built for the way people earn and live today. At Chime, we’re building solutions that give workers better access to their money and a stronger path to long-term financial well-being. For employers, that means deeper engagement, better retention, and healthier teams overall. Before Chime, I spent nearly two decades advising and building solutions for some of the world’s most powerful financial institutions, then went on to found and scale several fintech companies designed to close structural gaps in the system. I’ve seen how broken the infrastructure is—and I’m committed to reshaping it so it works for everyone. More on my work and what drives it: https://jason-lee.co
3rd rowAs the Chief of Chime Enterprise, I lead a high-growth business unit delivering next-generation financial wellness solutions to many of the nation’s largest employers and their workforces. My work is grounded in a simple but urgent belief: the current financial system is fragmented and inequitable—and it’s not built for the way people earn and live today. At Chime, we’re building solutions that give workers better access to their money and a stronger path to long-term financial well-being. For employers, that means deeper engagement, better retention, and healthier teams overall. Before Chime, I spent nearly two decades advising and building solutions for some of the world’s most powerful financial institutions, then went on to found and scale several fintech companies designed to close structural gaps in the system. I’ve seen how broken the infrastructure is—and I’m committed to reshaping it so it works for everyone. More on my work and what drives it: https://jason-lee.co
4th rowAs the Chief of Chime Enterprise, I lead a high-growth business unit delivering next-generation financial wellness solutions to many of the nation’s largest employers and their workforces. My work is grounded in a simple but urgent belief: the current financial system is fragmented and inequitable—and it’s not built for the way people earn and live today. At Chime, we’re building solutions that give workers better access to their money and a stronger path to long-term financial well-being. For employers, that means deeper engagement, better retention, and healthier teams overall. Before Chime, I spent nearly two decades advising and building solutions for some of the world’s most powerful financial institutions, then went on to found and scale several fintech companies designed to close structural gaps in the system. I’ve seen how broken the infrastructure is—and I’m committed to reshaping it so it works for everyone. More on my work and what drives it: https://jason-lee.co
5th rowAs the Chief of Chime Enterprise, I lead a high-growth business unit delivering next-generation financial wellness solutions to many of the nation’s largest employers and their workforces. My work is grounded in a simple but urgent belief: the current financial system is fragmented and inequitable—and it’s not built for the way people earn and live today. At Chime, we’re building solutions that give workers better access to their money and a stronger path to long-term financial well-being. For employers, that means deeper engagement, better retention, and healthier teams overall. Before Chime, I spent nearly two decades advising and building solutions for some of the world’s most powerful financial institutions, then went on to found and scale several fintech companies designed to close structural gaps in the system. I’ve seen how broken the infrastructure is—and I’m committed to reshaping it so it works for everyone. More on my work and what drives it: https://jason-lee.co

Value                 Count     Frequency (%)
and                   2624361   5.5%
the                   1618277   3.4%
of                    1343333   2.8%
in                    1147571   2.4%
a                     835440    1.8%
to                    745744    1.6%
i                     561212    1.2%
for                   517776    1.1%
as                    477944    1.0%
at                    468517    1.0%
Other values (28417)  37197486  78.2%

Most occurring characters

Value               Count      Frequency (%)
(space)             47369024   14.4%
e                   29360204   8.9%
a                   22069197   6.7%
i                   21296342   6.5%
n                   21070501   6.4%
t                   19469347   5.9%
o                   17817261   5.4%
r                   16987535   5.2%
s                   16032346   4.9%
l                   10749794   3.3%
Other values (173)  107035087  32.5%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  329256638  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  329256638  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  329256638  100.0%

NUMCONNECTIONS
Real number (ℝ)

Distinct: 345
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 465.8383478
Minimum: 151
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 151
5-th percentile: 262
Q1: 500
median: 500
Q3: 500
95-th percentile: 500
Maximum: 500
Range: 349
Interquartile range (IQR): 0
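
The three quartiles coincide because most rows sit at the maximum. The sketch below spells out that reasoning and the Range and IQR arithmetic, using the count for the value 500 (215124) from the frequency table later in this section. Treating a quantile as "the value at that cumulative share of sorted rows" is a simplifying assumption, not necessarily the report's interpolation rule.

```python
n_observations = 273_074
count_at_max = 215_124  # rows where NUMCONNECTIONS equals 500

# Share of rows strictly below the maximum value.
share_below_max = 1 - count_at_max / n_observations

# Any quantile above that share must land on the maximum (500).
quartiles_at_max = all(q > share_below_max for q in (0.25, 0.50, 0.75))

minimum, q1, q3, maximum = 151, 500, 500, 500
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

print(f"share below 500: {share_below_max:.3f}")
print(f"quartiles at maximum: {quartiles_at_max}")
print(f"range={value_range}, IQR={iqr}")
```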

Descriptive statistics

Standard deviation: 80.29544461
Coefficient of variation (CV): 0.1723676142
Kurtosis: 5.060259262
Mean: 465.8383478
Median Absolute Deviation (MAD): 0
Skewness: -2.456852149
Sum: 127208341
Variance: 6447.358425
Monotonicity: Not monotonic
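
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A quick Python check against the reported values, with the row count taken from the dataset statistics:

```python
import math

# Values copied from the NUMCONNECTIONS statistics.
mean = 465.8383478
std = 80.29544461
n_observations = 273_074

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance as squared standard deviation
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.6f}")
print(f"Sum: {total:.0f}")
print(math.isclose(cv, 0.1723676142, rel_tol=1e-6))
```

Small differences in the last digits are expected, since the inputs are themselves rounded.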
Histogram with fixed size bins (bins=50)
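
As an illustration of what fixed-size bins mean, the sketch below counts values into 50 equal-width intervals between the observed minimum and maximum. This is a plain-Python approximation and an assumption about the binning; the report's plotting code is not shown, and the input values are a small made-up sample within the NUMCONNECTIONS range.

```python
def fixed_width_histogram(values, bins=50):
    """Count values into `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


values = [151, 168, 292, 321, 403, 500, 500, 500]  # hypothetical sample
counts = fixed_width_histogram(values, bins=50)
print(counts)
```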
Value               Count   Frequency (%)
500                 215124  78.8%
403                 3235    1.2%
321                 3123    1.1%
292                 2330    0.9%
168                 1603    0.6%
456                 1563    0.6%
494                 1438    0.5%
394                 1238    0.5%
188                 1211    0.4%
464                 1197    0.4%
Other values (335)  41012   15.0%

Value  Count  Frequency (%)
151    22     < 0.1%
152    1186   0.4%
153    4      < 0.1%
154    139    0.1%
155    4      < 0.1%

Value  Count   Frequency (%)
500    215124  78.8%
499    7       < 0.1%
497    17      < 0.1%
495    2       < 0.1%
494    1438    0.5%

F_PROB
Real number (ℝ)

Zeros 

Distinct: 1508
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4518314847
Minimum: 0
Maximum: 1
Zeros: 26125
Zeros (%): 9.6%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.004226263147
median: 0.4040863216
Q3: 0.9969102144
95-th percentile: 1
Maximum: 1
Range: 1
Interquartile range (IQR): 0.9926839513

Descriptive statistics

Standard deviation: 0.4488875153
Coefficient of variation (CV): 0.9934843642
Kurtosis: -1.775321881
Mean: 0.4518314847
Median Absolute Deviation (MAD): 0.4007193241
Skewness: 0.1997508129
Sum: 123383.4309
Variance: 0.2015000014
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
127626
 
10.1%
026125
 
9.6%
0.510578930410759
 
3.9%
0.0037136019676329
 
2.3%
0.99711900955810
 
2.1%
0.53116440775109
 
1.9%
0.0042772246525006
 
1.8%
0.0056152753534473
 
1.6%
0.0042262631474294
 
1.6%
0.40408632163750
 
1.4%
Other values (1498)173793
63.6%
ValueCountFrequency (%)
026125
9.6%
0.0002121970901378
 
0.1%
0.0002207018262
 
< 0.1%
0.00022257835372
 
< 0.1%
0.0002333488956221
 
0.1%
ValueCountFrequency (%)
127626
10.1%
0.9998624921216
 
0.1%
0.99984157091
 
< 0.1%
0.99979966881
 
< 0.1%
0.99979567531
 
< 0.1%

M_PROB
Real number (ℝ)

Zeros 

Distinct: 1508
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5481685133
Minimum: 0
Maximum: 1
Zeros: 27626
Zeros (%): 10.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.003089785576
median: 0.5959136486
Q3: 0.9957737327
95-th percentile: 1
Maximum: 1
Range: 1
Interquartile range (IQR): 0.9926839471

Descriptive statistics

Standard deviation: 0.4488875156
Coefficient of variation (CV): 0.8188859898
Kurtosis: -1.775321887
Mean: 0.5481685133
Median Absolute Deviation (MAD): 0.4007193446
Skewness: -0.199750799
Sum: 149690.5686
Variance: 0.2015000017
Monotonicity: Not monotonic
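
The M_PROB statistics mirror those of F_PROB: the two means add up to 1, the standard deviations match, and the skewness has the opposite sign. That pattern is what one would expect if M_PROB equals 1 minus F_PROB row by row, which is an assumption here rather than something the report states. A small sketch on made-up probabilities shows the same behaviour; the `skewness` helper is a simple population formula, not necessarily the estimator the report uses.

```python
import math
import statistics


def skewness(xs):
    """Population skewness: third central moment over the cubed std deviation."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)


f_prob = [0.0, 0.004, 0.4, 0.51, 0.997, 1.0]  # hypothetical values
m_prob = [1.0 - p for p in f_prob]            # assumed complement

mean_sum = statistics.fmean(f_prob) + statistics.fmean(m_prob)
same_spread = math.isclose(statistics.pstdev(f_prob), statistics.pstdev(m_prob))
mirrored_skew = math.isclose(skewness(f_prob), -skewness(m_prob), abs_tol=1e-12)

# The same check on the means reported for F_PROB and M_PROB.
report_mean_sum = 0.4518314847 + 0.5481685133

print(mean_sum, same_spread, mirrored_skew, report_mean_sum)
```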
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
027626
 
10.1%
126125
 
9.6%
0.489421039810759
 
3.9%
0.99628639226329
 
2.3%
0.0028809905055810
 
2.1%
0.46883556255109
 
1.9%
0.99572277075006
 
1.8%
0.9943847064473
 
1.6%
0.99577373274294
 
1.6%
0.59591364863750
 
1.4%
Other values (1498)173793
63.6%
ValueCountFrequency (%)
027626
10.1%
0.0001375079155216
 
0.1%
0.00015842914581
 
< 0.1%
0.00020033121111
 
< 0.1%
0.00020432472231
 
< 0.1%
ValueCountFrequency (%)
126125
9.6%
0.9997878075378
 
0.1%
0.9997792842
 
< 0.1%
0.99977743632
 
< 0.1%
0.9997666478221
 
0.1%

WHITE_PROB
Real number (ℝ)

Zeros 

Distinct: 865
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5811596207
Minimum: 0
Maximum: 1
Zeros: 6465
Zeros (%): 2.4%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.002000000095
Q1: 0.1120000035
median: 0.7409999967
Q3: 0.9629999995
95-th percentile: 0.9909999967
Maximum: 1
Range: 1
Interquartile range (IQR): 0.850999996

Descriptive statistics

Standard deviation: 0.3945790435
Coefficient of variation (CV): 0.6789512372
Kurtosis: -1.532550894
Mean: 0.5811596207
Median Absolute Deviation (MAD): 0.2440000176
Skewness: -0.4153614864
Sum: 158699.5823
Variance: 0.1556926216
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.00300000002612626
 
4.6%
0.0020000000959662
 
3.5%
0.79000002157958
 
2.9%
0.98199999337775
 
2.8%
06465
 
2.4%
0.96299999955860
 
2.1%
0.20499999825263
 
1.9%
0.98699998865001
 
1.8%
0.98400002724061
 
1.5%
0.74099999673914
 
1.4%
Other values (855)204489
74.9%
ValueCountFrequency (%)
06465
2.4%
0.0010000000472989
 
1.1%
0.0020000000959662
3.5%
0.00300000002612626
4.6%
0.004000000192513
 
0.9%
ValueCountFrequency (%)
11644
0.6%
0.99900001291208
0.4%
0.99800002571914
0.7%
0.996999979652
 
0.2%
0.99599999191881
0.7%

BLACK_PROB
Real number (ℝ)

Zeros 

Distinct: 601
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09922306414
Minimum: 0
Maximum: 0.9980000257
Zeros: 40611
Zeros (%): 14.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.001000000047
median: 0.0120000001
Q3: 0.1089999974
95-th percentile: 0.5519999862
Maximum: 0.9980000257
Range: 0.9980000257
Interquartile range (IQR): 0.1079999974

Descriptive statistics

Standard deviation: 0.1918493846
Coefficient of variation (CV): 1.933516025
Kurtosis: 7.488700917
Mean: 0.09922306414
Median Absolute Deviation (MAD): 0.0120000001
Skewness: 2.753265525
Sum: 27095.23902
Variance: 0.03680618637
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.00100000004742620
 
15.6%
040611
 
14.9%
0.00899999961312993
 
4.8%
0.00300000002610698
 
3.9%
0.10899999747939
 
2.9%
0.0020000000956361
 
2.3%
0.014000000435547
 
2.0%
0.255527
 
2.0%
0.0060000000525394
 
2.0%
0.33000001314632
 
1.7%
Other values (591)130752
47.9%
ValueCountFrequency (%)
040611
14.9%
0.00100000004742620
15.6%
0.0020000000956361
 
2.3%
0.00300000002610698
 
3.9%
0.004000000194031
 
1.5%
ValueCountFrequency (%)
0.998000025713
 
< 0.1%
0.99599999191
 
< 0.1%
0.99500000482
 
< 0.1%
0.990999996721
 
< 0.1%
0.990000009555
< 0.1%

API_PROB
Real number (ℝ)

Zeros 

Distinct: 581
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1673660292
Minimum: 0
Maximum: 1
Zeros: 26816
Zeros (%): 9.8%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.001000000047
median: 0.008999999613
Q3: 0.06599999964
95-th percentile: 0.9869999886
Maximum: 1
Range: 1
Interquartile range (IQR): 0.06499999959

Descriptive statistics

Standard deviation: 0.3237927953
Coefficient of variation (CV): 1.934638689
Kurtosis: 1.706700673
Mean: 0.1673660292
Median Absolute Deviation (MAD): 0.008999999613
Skewness: 1.839422567
Sum: 45703.31105
Variance: 0.1048417743
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.00100000004752906
19.4%
026816
 
9.8%
0.00300000002619726
 
7.2%
0.00200000009516317
 
6.0%
0.018999999399689
 
3.5%
0.014999999669258
 
3.4%
0.019999999556570
 
2.4%
15888
 
2.2%
0.98699998865503
 
2.0%
0.0049999998885347
 
2.0%
Other values (571)115054
42.1%
ValueCountFrequency (%)
026816
9.8%
0.00100000004752906
19.4%
0.00200000009516317
 
6.0%
0.00300000002619726
 
7.2%
0.004000000195191
 
1.9%
ValueCountFrequency (%)
15888
2.2%
0.9990000129206
 
0.1%
0.99800002571443
 
0.5%
0.996999979387
 
0.1%
0.9959999919409
 
0.1%

HISPANIC_PROB
Real number (ℝ)

Zeros 

Distinct: 591
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.12772503
Minimum: 0
Maximum: 1
Zeros: 30085
Zeros (%): 11.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.002000000095
median: 0.0120000001
Q3: 0.08299999684
95-th percentile: 0.9079999924
Maximum: 1
Range: 1
Interquartile range (IQR): 0.08099999675

Descriptive statistics

Standard deviation: 0.2616267557
Coefficient of variation (CV): 2.048359321
Kurtosis: 4.490410381
Mean: 0.12772503
Median Absolute Deviation (MAD): 0.0120000001
Skewness: 2.408271915
Sum: 34878.38485
Variance: 0.06844855931
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.00100000004731087
 
11.4%
030085
 
11.0%
0.00300000002615471
 
5.7%
0.00200000009514573
 
5.3%
0.0109999999412264
 
4.5%
0.981999993310761
 
3.9%
0.076999999587936
 
2.9%
0.004000000197317
 
2.7%
0.0049999998885574
 
2.0%
0.019999999554902
 
1.8%
Other values (581)133104
48.7%
ValueCountFrequency (%)
030085
11.0%
0.00100000004731087
11.4%
0.00200000009514573
5.3%
0.00300000002615471
5.7%
0.004000000197317
 
2.7%
ValueCountFrequency (%)
1125
< 0.1%
0.9990000129150
0.1%
0.9980000257311
0.1%
0.99699997916
 
< 0.1%
0.9959999919196
0.1%

NATIVE_PROB
Real number (ℝ)

Zeros 

Distinct: 68
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.002862703912
Minimum: 0
Maximum: 0.1860000044
Zeros: 97216
Zeros (%): 35.6%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.001000000047
Q3: 0.003000000026
95-th percentile: 0.01099999994
Maximum: 0.1860000044
Range: 0.1860000044
Interquartile range (IQR): 0.003000000026

Descriptive statistics

Standard deviation: 0.006095161886
Coefficient of variation (CV): 2.129162524
Kurtosis: 127.418397
Mean: 0.002862703912
Median Absolute Deviation (MAD): 0.001000000047
Skewness: 8.884767455
Sum: 781.730008
Variance: 3.715099841 × 10^-5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
097216
35.6%
0.00100000004753288
19.5%
0.00200000009536976
 
13.5%
0.00300000002621369
 
7.8%
0.00499999988819152
 
7.0%
0.0040000001911316
 
4.1%
0.0109999999410956
 
4.0%
0.0060000000526068
 
2.2%
0.0070000002163125
 
1.1%
0.01200000012479
 
0.9%
Other values (58)11129
 
4.1%
ValueCountFrequency (%)
097216
35.6%
0.00100000004753288
19.5%
0.00200000009536976
 
13.5%
0.00300000002621369
 
7.8%
0.0040000001911316
 
4.1%
ValueCountFrequency (%)
0.18600000442
 
< 0.1%
0.12999999521
 
< 0.1%
0.12600000296
< 0.1%
0.1256
 
< 0.1%
0.10700000081
 
< 0.1%

MULTIPLE_PROB
Real number (ℝ)

Zeros 

Distinct: 224
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.02166355277
Minimum: 0
Maximum: 0.7269999981
Zeros: 59816
Zeros (%): 21.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.001000000047
median: 0.009999999776
Q3: 0.03099999949
95-th percentile: 0.06700000167
Maximum: 0.7269999981
Range: 0.7269999981
Interquartile range (IQR): 0.02999999944

Descriptive statistics

Standard deviation: 0.03802145014
Coefficient of variation (CV): 1.755088398
Kurtosis: 48.01212236
Mean: 0.02166355277
Median Absolute Deviation (MAD): 0.009999999776
Skewness: 5.679029795
Sum: 5915.75301
Variance: 0.00144563067
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
059816
21.9%
0.00100000004717014
 
6.2%
0.0040000001915442
 
5.7%
0.0020000000959708
 
3.6%
0.014000000439698
 
3.6%
0.008000000389162
 
3.4%
0.0060000000528236
 
3.0%
0.010999999946456
 
2.4%
0.043999999766138
 
2.2%
0.030999999495650
 
2.1%
Other values (214)125754
46.1%
ValueCountFrequency (%)
059816
21.9%
0.00100000004717014
 
6.2%
0.0020000000959708
 
3.6%
0.0030000000264024
 
1.5%
0.0040000001915442
 
5.7%
ValueCountFrequency (%)
0.72699999811
 
< 0.1%
0.63200002911
 
< 0.1%
0.61199998861
 
< 0.1%
0.56199997661
 
< 0.1%
0.52100002777
< 0.1%

PRESTIGE
Real number (ℝ)

Distinct: 4622
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4890997578
Minimum: -0.4861144423
Maximum: 1.221478224
Zeros: 452
Zeros (%): 0.2%
Negative: 46553
Negative (%): 17.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: -0.4861144423
5-th percentile: -0.240483731
Q1: 0.2873032093
median: 0.5957869291
Q3: 0.7953392267
95-th percentile: 0.9192367792
Maximum: 1.221478224
Range: 1.707592666
Interquartile range (IQR): 0.5080360174

Descriptive statistics

Standard deviation: 0.3723869814
Coefficient of variation (CV): 0.7613722466
Kurtosis: -0.4988840956
Mean: 0.4890997578
Median Absolute Deviation (MAD): 0.2239435911
Skewness: -0.8215096006
Sum: 133560.4273
Variance: 0.1386720639
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.641285061810759
 
3.9%
0.58053368336300
 
2.3%
0.93018412595109
 
1.9%
0.59578692914914
 
1.8%
0.65926361083864
 
1.4%
0.89607948063763
 
1.4%
0.25172159083750
 
1.4%
0.64929437643340
 
1.2%
-0.31391763693232
 
1.2%
0.8458366993120
 
1.1%
Other values (4612)224923
82.4%
ValueCountFrequency (%)
-0.48611444231
< 0.1%
-0.416697831
< 0.1%
-0.41529595851
< 0.1%
-0.41473001241
< 0.1%
-0.40562698251
< 0.1%
ValueCountFrequency (%)
1.2214782241
 
< 0.1%
1.21993780150
< 0.1%
1.1642940041
 
< 0.1%
1.14940023488
< 0.1%
1.1314681771
 
< 0.1%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 11
Median length: 6
Mean length: 6.346371313
Min length: 3

Characters and Unicode

Total characters: 1733029
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: empty
2nd row: empty
3rd row: empty
4th row: empty
5th row: empty

Value      Count   Frequency (%)
doctor     100998  36.6%
master     72071   26.1%
bachelor   69160   25.1%
mba        15317   5.5%
empty      12547   4.5%
high       2910    1.1%
school     2910    1.1%
associate  71      < 0.1%

Most occurring characters

Value              Count   Frequency (%)
o                  277047  16.0%
r                  242229  14.0%
t                  185687  10.7%
c                  173139  10.0%
e                  153849  8.9%
a                  141302  8.2%
D                  100998  5.8%
M                  87388   5.0%
B                  84477   4.9%
h                  74980   4.3%
Other values (11)  211933  12.2%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1733029  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1733029  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1733029  100.0%

EXPERTISE
Text

Missing 

Distinct: 2505
Distinct (%): 1.4%
Missing: 95382
Missing (%): 34.9%
Memory size: 2.1 MiB

Length

Max length: 1882
Median length: 859
Mean length: 462.7336008
Min length: 4

Characters and Unicode

Total characters: 82224059
Distinct characters: 116
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1181
Unique (%): 0.7%

Sample

1st row: autocad release 12,autocad release 2000,paradox,microsoft powerpoint,autocad accurender,windows xp,windows,spanish,microsoft outlook,press releases,windows 7,windows vista,adobe photoshop,premise 4.0,microsoft word,facebook,microsoft publisher,microsoft sharepoint,lotus notes,news editpro,mac osx,telnet,customer relations,microsoft excel,remedy,event planning,jet,copy editing,pine,microsoft exchange,windows 2000,microsoft access,file maker pro,operating systems,autocad release 14,writing,wordperfect,autocad release 13,jda systems,adobe acrobat,clarisworks,autocad release 10,adobe reader 9,adobe illustrator,proofreading,unix,language skills,cintra cts,autocad release 11,autocad anderson windows
2nd row: autocad release 12,autocad release 2000,paradox,microsoft powerpoint,autocad accurender,windows xp,windows,spanish,microsoft outlook,press releases,windows 7,windows vista,adobe photoshop,premise 4.0,microsoft word,facebook,microsoft publisher,microsoft sharepoint,lotus notes,news editpro,mac osx,telnet,customer relations,microsoft excel,remedy,event planning,jet,copy editing,pine,microsoft exchange,windows 2000,microsoft access,file maker pro,operating systems,autocad release 14,writing,wordperfect,autocad release 13,jda systems,adobe acrobat,clarisworks,autocad release 10,adobe reader 9,adobe illustrator,proofreading,unix,language skills,cintra cts,autocad release 11,autocad anderson windows
3rd row: autocad release 12,autocad release 2000,paradox,microsoft powerpoint,autocad accurender,windows xp,windows,spanish,microsoft outlook,press releases,windows 7,windows vista,adobe photoshop,premise 4.0,microsoft word,facebook,microsoft publisher,microsoft sharepoint,lotus notes,news editpro,mac osx,telnet,customer relations,microsoft excel,remedy,event planning,jet,copy editing,pine,microsoft exchange,windows 2000,microsoft access,file maker pro,operating systems,autocad release 14,writing,wordperfect,autocad release 13,jda systems,adobe acrobat,clarisworks,autocad release 10,adobe reader 9,adobe illustrator,proofreading,unix,language skills,cintra cts,autocad release 11,autocad anderson windows
4th row: autocad release 12,autocad release 2000,paradox,microsoft powerpoint,autocad accurender,windows xp,windows,spanish,microsoft outlook,press releases,windows 7,windows vista,adobe photoshop,premise 4.0,microsoft word,facebook,microsoft publisher,microsoft sharepoint,lotus notes,news editpro,mac osx,telnet,customer relations,microsoft excel,remedy,event planning,jet,copy editing,pine,microsoft exchange,windows 2000,microsoft access,file maker pro,operating systems,autocad release 14,writing,wordperfect,autocad release 13,jda systems,adobe acrobat,clarisworks,autocad release 10,adobe reader 9,adobe illustrator,proofreading,unix,language skills,cintra cts,autocad release 11,autocad anderson windows
5th row: autocad release 12,autocad release 2000,paradox,microsoft powerpoint,autocad accurender,windows xp,windows,spanish,microsoft outlook,press releases,windows 7,windows vista,adobe photoshop,premise 4.0,microsoft word,facebook,microsoft publisher,microsoft sharepoint,lotus notes,news editpro,mac osx,telnet,customer relations,microsoft excel,remedy,event planning,jet,copy editing,pine,microsoft exchange,windows 2000,microsoft access,file maker pro,operating systems,autocad release 14,writing,wordperfect,autocad release 13,jda systems,adobe acrobat,clarisworks,autocad release 10,adobe reader 9,adobe illustrator,proofreading,unix,language skills,cintra cts,autocad release 11,autocad anderson windows

Value | Count | Frequency (%)
release | 38030 | 0.9%
media | 28695 | 0.7%
(empty) | 24348 | 0.6%
team | 19841 | 0.5%
management | 19167 | 0.5%
business | 18306 | 0.4%
project | 16115 | 0.4%
creative | 13245 | 0.3%
and | 11423 | 0.3%
management,team | 11302 | 0.3%
Other values (27412) | 3913711 | 95.1%

Most occurring characters

Value | Count | Frequency (%)
e | 8008557 | 9.7%
i | 6991180 | 8.5%
a | 6342387 | 7.7%
n | 6051576 | 7.4%
t | 5793013 | 7.0%
, | 5407046 | 6.6%
r | 4852987 | 5.9%
s | 4644111 | 5.6%
o | 4454135 | 5.4%
(space) | 3934655 | 4.8%
Other values (106) | 25744412 | 31.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 82224059 | 100.0%


CURRENTINDUSTRY
Text

Missing 

Distinct: 215
Distinct (%): 0.1%
Missing: 3625
Missing (%): 1.3%
Memory size: 2.1 MiB

Length

Max length: 53
Median length: 43
Mean length: 18.17111958
Min length: 5

Characters and Unicode

Total characters: 4896190
Distinct characters: 52
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
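
As a rough illustration of how code-point properties can be tallied for a text column, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the second sample string are illustrative assumptions, not part of the report. Scripts and blocks are not exposed by `unicodedata` and would need a separate lookup.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories (e.g. 'Lu', 'Ll', 'Zs') over strings."""
    counts = Counter()
    for text in texts:
        # unicodedata.category returns the two-letter general category
        # of a single character, such as 'Lu' for an uppercase letter.
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# "Computer Software" appears in the sample rows; the second value is hypothetical.
sample = ["Computer Software", "Higher Education"]
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```

Grouping by general category in this way gives the kind of per-category character totals that the report lists under its category tables.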

Unique

Unique: 19
Unique (%): < 0.1%

Sample

1st row: Computer Software
2nd row: Computer Software
3rd row: Computer Software
4th row: Computer Software
5th row: Computer Software

Value | Count | Frequency (%)
(empty) | 52808 | 8.6%
services | 35451 | 5.7%
health | 27360 | 4.4%
care | 26672 | 4.3%
hospital | 24293 | 3.9%
education | 23232 | 3.8%
practice | 21469 | 3.5%
higher | 21148 | 3.4%
law | 18899 | 3.1%
management | 17264 | 2.8%
Other values (234) | 348621 | 56.5%

Most occurring characters

Value | Count | Frequency (%)
e | 470769 | 9.6%
i | 442073 | 9.0%
a | 391151 | 8.0%
t | 379859 | 7.8%
n | 373521 | 7.6%
(space) | 347768 | 7.1%
r | 298310 | 6.1%
o | 233012 | 4.8%
c | 221547 | 4.5%
s | 184157 | 3.8%
Other values (42) | 1554023 | 31.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4896190 | 100.0%


IS_BAD_USER
Boolean

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 266.8 KiB

Value | Count | Frequency (%)
False | 273059 | > 99.9%
True | 15 | < 0.1%
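
The report does not show how the 99.9% imbalance figure for IS_BAD_USER is derived. One common choice, assumed here purely for illustration, is one minus the normalized Shannon entropy of the class counts. The function name `imbalance_score` is hypothetical, and the profiler's actual formula may differ.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    This is an assumed definition, not one confirmed by the report.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1.0 - entropy / math.log2(n_classes)


# Class counts taken from the table above: False = 273059, True = 15.
score = imbalance_score([273059, 15])
print(f"{score:.1%}")
```

Under this assumed definition the counts above round to the same 99.9% shown in the alert, which suggests (but does not prove) that an entropy-based score is in use.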
Distinct: 475
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Characters and Unicode

Total characters: 6280702
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 198
Unique (%): 0.1%

Sample

1st row: 2025-08-30 00:00:00.000
2nd row: 2025-08-30 00:00:00.000
3rd row: 2025-08-30 00:00:00.000
4th row: 2025-08-30 00:00:00.000
5th row: 2025-08-30 00:00:00.000

Value | Count | Frequency (%)
00:00:00.000 | 273074 | 50.0%
2025-08-29 | 40548 | 7.4%
2025-08-30 | 30749 | 5.6%
2025-08-28 | 13847 | 2.5%
2025-08-31 | 12903 | 2.4%
2025-09-01 | 11118 | 2.0%
2025-08-26 | 10908 | 2.0%
2025-07-07 | 6419 | 1.2%
2025-05-05 | 6306 | 1.2%
2025-06-23 | 6063 | 1.1%
Other values (466) | 134213 | 24.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 3101724 | 49.4%
2 | 684309 | 10.9%
- | 546148 | 8.7%
: | 546148 | 8.7%
5 | 299845 | 4.8%
(space) | 273074 | 4.3%
. | 273074 | 4.3%
8 | 189658 | 3.0%
1 | 88675 | 1.4%
3 | 81814 | 1.3%
Other values (4) | 196233 | 3.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 6280702 | 100.0%


PUB_NAME
Text

Missing 

Distinct: 24238
Distinct (%): 9.1%
Missing: 8047
Missing (%): 2.9%
Memory size: 2.1 MiB

Length

Max length: 263
Median length: 209
Mean length: 68.05756772
Min length: 1

Characters and Unicode

Total characters: 18037093
Distinct characters: 185
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4476
Unique (%): 1.7%

Sample

1st row: New Payday Options for Making Ends Meet
2nd row: DailyPay CYCLE Feature
3rd row: Help employees take control of their finances with DailyPay by giving them the financial flexibility they desire
4th row: PAYTECH January 2021 : Page 12
5th row: SVUS Awards® Winners Announced in Annual 2020 Business Awards

Value | Count | Frequency (%)
the | 87588 | 3.4%
of | 81810 | 3.2%
and | 69777 | 2.7%
in | 64784 | 2.5%
a | 43779 | 1.7%
for | 42481 | 1.7%
to | 42100 | 1.6%
with | 21081 | 0.8%
(empty) | 20173 | 0.8%
on | 18400 | 0.7%
Other values (35753) | 2080462 | 80.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2312201 | 12.8%
e | 1550043 | 8.6%
i | 1184131 | 6.6%
n | 1142688 | 6.3%
a | 1103517 | 6.1%
o | 1087105 | 6.0%
t | 1067083 | 5.9%
r | 928676 | 5.1%
s | 842308 | 4.7%
l | 578686 | 3.2%
Other values (175) | 6240655 | 34.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 18037093 | 100.0%


PUB_DATE
Text

Missing 

Distinct: 4944
Distinct (%): 2.3%
Missing: 57936
Missing (%): 21.2%
Memory size: 2.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2151380
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 363
Unique (%): 0.2%

Sample

1st row: 2016-07-04
2nd row: 2021-03-02
3rd row: 2021-02-19
4th row: 2021-01-01
5th row: 2020-10-29

Value | Count | Frequency (%)
2018-01-01 | 3562 | 1.7%
2020-01-01 | 3518 | 1.6%
2019-01-01 | 3109 | 1.4%
2013-01-01 | 2799 | 1.3%
2021-01-01 | 2668 | 1.2%
2009-01-01 | 2552 | 1.2%
2010-01-01 | 2392 | 1.1%
2017-01-01 | 2301 | 1.1%
2012-01-01 | 2172 | 1.0%
2015-01-01 | 1979 | 0.9%
Other values (4934) | 188086 | 87.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 590128 | 27.4%
- | 430276 | 20.0%
1 | 407152 | 18.9%
2 | 370448 | 17.2%
9 | 60513 | 2.8%
3 | 58910 | 2.7%
8 | 51674 | 2.4%
6 | 47612 | 2.2%
4 | 45907 | 2.1%
5 | 44842 | 2.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2151380 | 100.0%


PUB_DESCRIPTION
Text

Missing 

Distinct: 10707
Distinct (%): 8.7%
Missing: 149852
Missing (%): 54.9%
Memory size: 2.1 MiB

Length

Max length: 2048
Median length: 1725
Mean length: 356.6687686
Min length: 3

Characters and Unicode

Total characters: 43949439
Distinct characters: 183
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2392
Unique (%): 1.9%

Sample

1st row: This insider guide to Stony Brook University in Stony Brook, NY, features more than 160 pages of in-depth information, including student reviews, rankings across 20 campus life topics, and insider tips from students on campus. Written by a student at Stony Brook, this guidebook gives you the inside scoop on everything from academics and nightlife to housing and the meal plan. Read both the good and the bad and discover if Stony Brook is right for you. One of nearly 500 College Prowler guides, this Stony Brook guide features updated facts and figures along with the latest student reviews and insider tips from current students on campus. Find out what it’s like to be a student at Stony Brook and see if Stony Brook is the place for you.
2nd row: Pascale discusses a “make-or-break” moment in her career, and how her former career as an athlete has shaped her as a leader.
3rd row: An Interview with Pascale Witz, Executive Vice President, Diabetes & Cardiovascular, Sanofi
4th row: As one of Fortune’s Most Powerful Women, Pascale regularly contributes to their website. In this post, Pascale gives her advice to first-time managers.
5th row: Abstract: The neocortex contains excitatory neurons and inhibitory interneurons. Clones of neocortical excitatory neurons originating from the same progenitor cell are spatially organized and contribute to the formation of functional microcircuits. In contrast, relatively little is known about the production and organization of neocortical inhibitory interneurons. We found that neocortical inhibitory interneurons were produced as spatially organized clonal units in the developing ventral telencephalon. Furthermore, clonally related interneurons did not randomly disperse but formed spatially isolated clusters in the neocortex. Individual clonal clusters consisting of interneurons expressing the same or distinct neurochemical markers exhibited clear vertical or horizontal organization. These results suggest that the lineage relationship plays a pivotal role in the organization of inhibitory interneurons in the neocortex.

Value | Count | Frequency (%)
the | 331878 | 5.0%
and | 218522 | 3.3%
of | 206640 | 3.1%
to | 157856 | 2.4%
a | 137311 | 2.1%
in | 136133 | 2.1%
for | 74639 | 1.1%
is | 59943 | 0.9%
that | 49837 | 0.8%
on | 47795 | 0.7%
Other values (49715) | 5156906 | 78.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 6511731 | 14.8%
e | 4129383 | 9.4%
t | 2938649 | 6.7%
a | 2753425 | 6.3%
i | 2718988 | 6.2%
o | 2630762 | 6.0%
n | 2626018 | 6.0%
r | 2309110 | 5.3%
s | 2224447 | 5.1%
l | 1375968 | 3.1%
Other values (173) | 13730958 | 31.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 43949439 | 100.0%


AWARD_NAME
Text

Missing 

Distinct: 15802
Distinct (%): 5.9%
Missing: 6483
Missing (%): 2.4%
Memory size: 2.1 MiB

Length

Max length: 257
Median length: 172
Mean length: 45.23419395
Min length: 3

Characters and Unicode

Total characters: 12059029
Distinct characters: 170
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4636
Unique (%): 1.7%

Sample

1st row: 2019 Top 20 Digital Innovator in Benefits
2nd row: 2019 Top 20 Digital Innovator in Benefits
3rd row: 2019 Top 20 Digital Innovator in Benefits
4th row: 2019 Top 20 Digital Innovator in Benefits
5th row: 2019 Top 20 Digital Innovator in Benefits

Value | Count | Frequency (%)
award | 70858 | 4.1%
(empty) | 51507 | 3.0%
of | 46198 | 2.7%
the | 41018 | 2.4%
in | 33840 | 1.9%
for | 32090 | 1.8%
and | 22793 | 1.3%
best | 16004 | 0.9%
new | 14449 | 0.8%
top | 13211 | 0.8%
Other values (13425) | 1399878 | 80.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1479856 | 12.3%
e | 1009011 | 8.4%
a | 746531 | 6.2%
n | 740301 | 6.1%
i | 736120 | 6.1%
r | 710027 | 5.9%
o | 673561 | 5.6%
t | 630779 | 5.2%
s | 448923 | 3.7%
l | 376048 | 3.1%
Other values (160) | 4507872 | 37.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 12059029 | 100.0%


AWARD_COMPANY
Text

Missing 

Distinct: 8391
Distinct (%): 3.5%
Missing: 31679
Missing (%): 11.6%
Memory size: 2.1 MiB

Length

Max length: 246
Median length: 130
Mean length: 23.98495826
Min length: 1

Characters and Unicode

Total characters: 5789849
Distinct characters: 143
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1975
Unique (%): 0.8%

Sample

1st row: Employee Benefit News
2nd row: Employee Benefit News
3rd row: Employee Benefit News
4th row: Employee Benefit News
5th row: Employee Benefit News

Value | Count | Frequency (%)
(empty) | 60960 | 7.2%
of | 47305 | 5.6%
university | 23904 | 2.8%
the | 17039 | 2.0%
and | 13149 | 1.6%
awards | 12840 | 1.5%
society | 10710 | 1.3%
association | 10655 | 1.3%
new | 10448 | 1.2%
school | 10139 | 1.2%
Other values (7729) | 626050 | 74.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 602564 | 10.4%
e | 480409 | 8.3%
i | 415986 | 7.2%
o | 372568 | 6.4%
a | 370102 | 6.4%
n | 362740 | 6.3%
r | 320239 | 5.5%
t | 312795 | 5.4%
s | 267131 | 4.6%
l | 195995 | 3.4%
Other values (133) | 2089320 | 36.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 5789849 | 100.0%


AWARD_DESCRIPTION
Text

Missing 

Distinct: 8602
Distinct (%): 6.6%
Missing: 141832
Missing (%): 51.9%
Memory size: 2.1 MiB

Length

Max length: 2070
Median length: 984
Mean length: 200.8836881
Min length: 2

Characters and Unicode

Total characters: 26364377
Distinct characters: 181
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2657
Unique (%): 2.0%

Sample

1st row: Winner of Best Derivatives Provider, North America in 2011, 2013
2nd row: Winner of Best Derivatives Provider, North America in 2011, 2013
3rd row: Winner of Best Derivatives Provider, North America in 2011, 2013
4th row: Winner of Best Derivatives Provider, North America in 2011, 2013
5th row: Winner of Best Derivatives Provider, North America in 2011, 2013

Value | Count | Frequency (%)
the | 217502 | 5.7%
and | 127689 | 3.3%
of | 119735 | 3.1%
in | 91789 | 2.4%
to | 82822 | 2.2%
for | 76368 | 2.0%
a | 66998 | 1.8%
is | 28370 | 0.7%
as | 27047 | 0.7%
award | 26932 | 0.7%
Other values (21559) | 2947559 | 77.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 3693661 | 14.0%
e | 2387026 | 9.1%
t | 1699242 | 6.4%
a | 1641916 | 6.2%
i | 1623962 | 6.2%
n | 1577838 | 6.0%
o | 1533452 | 5.8%
r | 1407228 | 5.3%
s | 1295108 | 4.9%
d | 792504 | 3.0%
Other values (171) | 8712440 | 33.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 26364377 | 100.0%


AWARD_DATE
Text

Missing 

Distinct: 407
Distinct (%): 0.2%
Missing: 46076
Missing (%): 16.9%
Memory size: 2.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2269980
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: 2019-05-01
2nd row: 2019-05-01
3rd row: 2019-05-01
4th row: 2019-05-01
5th row: 2019-05-01

Value | Count | Frequency (%)
2015-01-01 | 5714 | 2.5%
2016-01-01 | 4959 | 2.2%
2013-01-01 | 4814 | 2.1%
2020-01-01 | 4677 | 2.1%
2019-01-01 | 4080 | 1.8%
2014-01-01 | 3986 | 1.8%
2018-01-01 | 3775 | 1.7%
2012-01-01 | 3371 | 1.5%
2021-05-01 | 3370 | 1.5%
2017-01-01 | 3339 | 1.5%
Other values (397) | 184913 | 81.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 698511 | 30.8%
1 | 498545 | 22.0%
- | 453996 | 20.0%
2 | 328699 | 14.5%
9 | 57377 | 2.5%
5 | 52655 | 2.3%
6 | 39154 | 1.7%
4 | 38211 | 1.7%
8 | 38200 | 1.7%
3 | 35504 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2269980 | 100.0%


CITY
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 3549962
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New York City
2nd row: New York City
3rd row: New York City
4th row: New York City
5th row: New York City

Value | Count | Frequency (%)
new | 273074 | 33.3%
york | 273074 | 33.3%
city | 273074 | 33.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 546148 | 15.4%
N | 273074 | 7.7%
e | 273074 | 7.7%
w | 273074 | 7.7%
Y | 273074 | 7.7%
o | 273074 | 7.7%
r | 273074 | 7.7%
k | 273074 | 7.7%
C | 273074 | 7.7%
i | 273074 | 7.7%
Other values (2) | 546148 | 15.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3549962 | 100.0%


STATE
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 2184592
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New York
2nd row: New York
3rd row: New York
4th row: New York
5th row: New York

Value | Count | Frequency (%)
new | 273074 | 50.0%
york | 273074 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
N | 273074 | 12.5%
e | 273074 | 12.5%
w | 273074 | 12.5%
(space) | 273074 | 12.5%
Y | 273074 | 12.5%
o | 273074 | 12.5%
r | 273074 | 12.5%
k | 273074 | 12.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2184592 | 100.0%


COUNTRY
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 3549962
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States

Value | Count | Frequency (%)
united | 273074 | 50.0%
states | 273074 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
t | 819222 | 23.1%
e | 546148 | 15.4%
U | 273074 | 7.7%
n | 273074 | 7.7%
i | 273074 | 7.7%
d | 273074 | 7.7%
(space) | 273074 | 7.7%
S | 273074 | 7.7%
a | 273074 | 7.7%
s | 273074 | 7.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3549962 | 100.0%


USER_ID
Real number (ℝ)

Distinct: 4757
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 483499942
Minimum: 1241390
Maximum: 2225886059
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 MiB

Quantile statistics

Minimum: 1241390
5-th percentile: 51293060
Q1: 228217596
median: 424195151
Q3: 613549428
95-th percentile: 1056527074
Maximum: 2225886059
Range: 2224644669
Interquartile range (IQR): 385331832

Descriptive statistics

Standard deviation: 408882318.3
Coefficient of variation (CV): 0.8456719075
Kurtosis: 7.17125062
Mean: 483499942
Median Absolute Deviation (MAD): 189354277
Skewness: 2.393891773
Sum: 1.320312632 × 10^14
Variance: 1.671847502 × 10^17
Monotonicity: Not monotonic
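
Several of the USER_ID figures above are derived from others, so they can be cross-checked by hand. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported values. The variable names are illustrative, and the formulas used (IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) are the standard definitions, assumed to match what the profiler reports.

```python
# Values copied from the USER_ID statistics in this report.
minimum, maximum = 1241390, 2225886059
q1, q3 = 228217596, 613549428
mean, std = 483499942, 408882318.3

value_range = maximum - minimum   # reported Range: 2224644669
iqr = q3 - q1                     # reported IQR: 385331832
cv = std / mean                   # reported CV: 0.8456719075
variance = std ** 2               # reported Variance: 1.671847502 x 10^17

print("range   ", value_range)
print("IQR     ", iqr)
print("CV      ", round(cv, 4))
print("variance", f"{variance:.4e}")
```

Because the standard deviation is printed to only ten significant digits, the recomputed CV and variance agree with the report up to rounding rather than exactly.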
Histogram with fixed size bins (bins=50)
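
To make "fixed size bins" concrete, the sketch below splits the USER_ID range into 50 equal-width bins and maps a value to its bin index. It is a plain-Python illustration: the names `bin_edges` and `bin_index` are hypothetical, and the profiler's own binning code is not shown in the report.

```python
def bin_edges(minimum, maximum, bins):
    """Return bins + 1 equally spaced edges covering [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]


def bin_index(value, minimum, maximum, bins):
    """Return the 0-based bin that value falls into; the maximum goes in the last bin."""
    width = (maximum - minimum) / bins
    index = int((value - minimum) / width)
    return min(index, bins - 1)


# Minimum, maximum, and median as reported for USER_ID; bins=50 as in the caption.
lo, hi, bins = 1241390, 2225886059, 50
edges = bin_edges(lo, hi, bins)
median_bin = bin_index(424195151, lo, hi, bins)
print(len(edges), median_bin)
```

With equal-width bins and a skewness above 2, most observations land in the first few bins, which is why a long right tail is expected in the plot.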

Value | Count | Frequency (%)
613549428 | 10759 | 3.9%
295024428 | 6300 | 2.3%
275633734 | 5109 | 1.9%
522221259 | 4914 | 1.8%
125578367 | 3864 | 1.4%
536929814 | 3763 | 1.4%
177348486 | 3750 | 1.4%
680814974 | 3340 | 1.2%
707836292 | 3232 | 1.2%
363846886 | 3120 | 1.1%
Other values (4747) | 224923 | 82.4%

Value | Count | Frequency (%)
1241390 | 12 | < 0.1%
1249886 | 60 | < 0.1%
1284328 | 40 | < 0.1%
1490244 | 30 | < 0.1%
1511359 | 5 | < 0.1%

Value | Count | Frequency (%)
2225886059 | 4 | < 0.1%
2225747644 | 1 | < 0.1%
2225599636 | 1 | < 0.1%
2225251020 | 2 | < 0.1%
2224246512 | 1 | < 0.1%